An Adaptive Resonance Theory-based Topological Clustering Algorithm with a Self-Adjusting Vigilance Parameter

arXiv.org Machine Learning

Clustering in stationary and nonstationary settings, where data distributions remain static or evolve over time, requires models that can adapt to distributional shifts while preserving previously learned cluster structures. This paper proposes an Adaptive Resonance Theory (ART)-based topological clustering algorithm that autonomously adjusts its recalculation interval and vigilance threshold through a diversity-driven adaptation mechanism. This mechanism enables hyperparameter-free learning that maintains cluster stability and continuity in dynamic environments. Experiments on 24 real-world datasets demonstrate that the proposed algorithm outperforms state-of-the-art methods in both clustering performance and continual learning capability. These results highlight the effectiveness of the proposed parameter adaptation in mitigating catastrophic forgetting and maintaining consistent clustering in evolving data streams. Source code is available at https://github.com/Masuyama-lab/IDAT
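As a rough illustration of what a vigilance threshold does in ART-style clustering, the sketch below assigns each sample to its nearest node when a similarity score passes the threshold and otherwise creates a new node. It is not the proposed algorithm: the vigilance here is fixed rather than self-adjusting, there is no topology or recalculation interval, and the function name `art_like_cluster`, the Gaussian similarity, and the learning rate `lr` are all assumptions made for the example.

```python
import numpy as np

def art_like_cluster(points, vigilance, lr=0.5):
    """Toy vigilance test with a FIXED threshold (hypothetical helper)."""
    nodes = []   # one prototype vector per cluster
    labels = []  # cluster index assigned to each sample
    for x in points:
        if nodes:
            dists = [np.linalg.norm(x - w) for w in nodes]
            winner = int(np.argmin(dists))
            # Assumed similarity measure: Gaussian of the distance to the winner.
            similarity = np.exp(-dists[winner] ** 2)
        if nodes and similarity >= vigilance:
            # Resonance: move the winning prototype toward the sample.
            nodes[winner] = nodes[winner] + lr * (x - nodes[winner])
            labels.append(winner)
        else:
            # Vigilance test failed (or no nodes yet): create a new node.
            nodes.append(x.astype(float).copy())
            labels.append(len(nodes) - 1)
    return nodes, labels

# Hypothetical toy data: two well-separated groups.
points = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
nodes, labels = art_like_cluster(points, vigilance=0.5)
```

A higher vigilance value makes the test stricter and yields more, smaller clusters, which is why choosing it by hand is difficult and why the abstract emphasizes adjusting it automatically.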


Feature Learning for Interpretable, Performant Decision Trees: Supplementary Material, 1 Experiment Specification

Neural Information Processing Systems

Here we cover the full specification of the experiments. Some details were omitted from the main text. If there were separate training and test sets, they were combined before creating the random 10-fold split. All attributes are normalized to mean 0 and standard deviation 1. Additional details for each model type follow.
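The preprocessing described above can be sketched as follows. This is a minimal example under stated assumptions: the excerpt does not say whether the normalization statistics are computed on the combined data or per fold, so the sketch uses the combined data, and the helper names `standardize` and `ten_fold_indices`, the seed, and the toy arrays are hypothetical.

```python
import numpy as np

def standardize(X):
    """Scale every column to mean 0 and standard deviation 1."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant columns become 0 instead of dividing by zero
    return (X - mean) / std

def ten_fold_indices(n_samples, n_folds=10, seed=0):
    """Return a random partition of the sample indices into n_folds folds."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    return np.array_split(perm, n_folds)

# Hypothetical data standing in for a dataset with a predefined split.
rng = np.random.default_rng(0)
X_train = rng.normal(5.0, 2.0, size=(70, 3))
X_test = rng.normal(5.0, 2.0, size=(30, 3))

# Combine the predefined split first, then normalize (assumed: on all data).
X_all = standardize(np.vstack([X_train, X_test]))
folds = ten_fold_indices(len(X_all))

# Each fold serves once as the held-out set; the rest is used for training.
for held_out in folds:
    train_idx = np.setdiff1d(np.arange(len(X_all)), held_out)
```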



18997733ec258a9fcaf239cc55d53363-Reviews.html

Neural Information Processing Systems

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. Thanks to your rebuttal, I think I now understand your algorithm, and I think it is correct. But why does Figure 2 present Algorithm 2 with CB and not TCB? The algorithm with CB does not work, and it is misleading to put CB in Figure 2. I would recommend changing this and using TCB in the presentation of your algorithm. Also, please comment on the necessity of knowing L(u_1,...,u_n), or rather an upper bound on it, and restate the theorem in terms of an upper bound, since it is unrealistic to assume that the exact quantity is available.


GraphOmni: A Comprehensive and Extendable Benchmark Framework for Large Language Models on Graph-theoretic Tasks

arXiv.org Artificial Intelligence

This paper introduces GraphOmni, a comprehensive benchmark designed to evaluate the reasoning capabilities of LLMs on graph-theoretic tasks articulated in natural language. GraphOmni encompasses diverse graph types, serialization formats, and prompting schemes, significantly exceeding prior efforts in both scope and depth. Through extensive systematic evaluation, we identify critical interactions among these dimensions, demonstrating their substantial impact on model performance. Our experiments reveal that state-of-the-art models like Claude-3.5 and o4-mini consistently outperform other models, yet even these leading models exhibit substantial room for improvement. Performance variability is evident depending on the specific combinations of factors we considered, underscoring the necessity of comprehensive evaluations across these interconnected dimensions. Additionally, we observe distinct impacts of serialization and prompting strategies between open-source and closed-source models, encouraging the development of tailored approaches. Motivated by the findings, we also propose a reinforcement learning-inspired framework that adaptively selects the optimal factors influencing LLM reasoning capabilities. This flexible and extendable benchmark not only deepens our understanding of LLM performance on structured tasks but also provides a robust foundation for advancing research in LLM-based graph reasoning. The code and datasets are available at https://github.com/GAI-Community/GraphOmni.


Fair Distributed Machine Learning with Imbalanced Data as a Stackelberg Evolutionary Game

arXiv.org Artificial Intelligence

Decentralized data refers to the distribution of data across multiple, often geographically dispersed locations or sources, rather than centralizing it at a single site, server, or storage location. This decentralization of data is becoming more common due to the proliferation of connected devices, edge computing, and privacy concerns. While decentralized data offers advantages in terms of data security, privacy, and accessibility, it poses significant challenges for the training of machine learning algorithms. The challenge of decentralized data is addressed through decentralized machine learning [1] [3] by enabling model training across multiple nodes without the need to centralize the data. Techniques such as federated learning [15] allow the data to remain on local devices, while only model updates are shared and aggregated, preserving privacy and reducing the risk of data breaches [36]. This approach not only increases data security, but also enables compliance with data protection regulations and improves scalability by utilizing the computing power of numerous decentralized nodes. A particular challenge in these decentralized learning setups is posed by domains in which the data distributions differ strongly between the individual nodes [11]. This problem is referred to as non-independent and identically distributed (non-iid) data [17] and concerns differences in the label distributions that can arise from user behavior, geographical differences, different levels of knowledge, socio-cultural differences, or technical differences in the recording devices [20]. In medical use cases, the problem arises from the large differences between the nodes, which are also the data generators.
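To make the "only model updates are shared and aggregated" step concrete, here is a minimal sketch of sample-size-weighted averaging in the style of federated averaging. It illustrates the generic aggregation idea only, not the Stackelberg evolutionary game proposed in the paper; the function name `aggregate`, the parameter vectors, and the client sizes are hypothetical.

```python
import numpy as np

def aggregate(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local sample counts."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)  # shape: (n_clients, n_params)
    return (sizes[:, None] * stacked).sum(axis=0) / sizes.sum()

# Hypothetical parameters from three nodes with imbalanced data volumes.
client_weights = [
    np.array([1.0, 0.0]),
    np.array([0.0, 1.0]),
    np.array([1.0, 1.0]),
]
client_sizes = [10, 30, 60]

global_weights = aggregate(client_weights, client_sizes)
```

With this weighting, the node holding 60 samples dominates the result, which hints at why imbalanced, non-iid nodes raise fairness concerns of the kind the abstract describes.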


Thinking Forward and Backward: Effective Backward Planning with Large Language Models

arXiv.org Artificial Intelligence

Large language models (LLMs) have exhibited remarkable reasoning and planning capabilities. Most prior work in this area has used LLMs to reason through steps from an initial to a goal state or criterion, thereby effectively reasoning in a forward direction. Nonetheless, many planning problems exhibit an inherent asymmetry such that planning backward from the goal is significantly easier -- for example, if there are bottlenecks close to the goal. We take inspiration from this observation and demonstrate that this bias holds for LLM planning as well: planning performance in one direction correlates with the planning complexity of the problem in that direction. However, our experiments also reveal systematic biases which lead to poor planning in the backward direction. With this knowledge, we propose a backward planning algorithm for LLMs that first flips the problem and then plans forward in the flipped problem. This helps avoid the backward bias, generate more diverse candidate plans, and exploit asymmetries between the forward and backward directions in planning problems -- we find that combining planning in both directions with self-verification improves the overall planning success rates by 4-24% in three planning domains.
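The "flip the problem, then plan forward" idea can be sketched on a toy graph-search problem. In this sketch a breadth-first search stands in for the LLM planner, and the problem encoding (a start node, a goal node, and directed edges) as well as the names `flip`, `plan_forward`, and `plan_backward` are assumptions made for illustration; self-verification and the combination of both directions are omitted.

```python
from collections import deque

def flip(problem):
    """Swap start and goal and reverse every directed edge."""
    start, goal, edges = problem
    return goal, start, [(v, u) for (u, v) in edges]

def plan_forward(problem):
    """Breadth-first search from start to goal (stand-in for the planner)."""
    start, goal, edges = problem
    successors = {}
    for u, v in edges:
        successors.setdefault(u, []).append(v)
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in successors.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None  # goal unreachable

def plan_backward(problem):
    """Plan forward in the flipped problem, then reverse the resulting plan."""
    flipped_plan = plan_forward(flip(problem))
    return None if flipped_plan is None else flipped_plan[::-1]

# Hypothetical problem: reach D from A.
problem = ("A", "D", [("A", "B"), ("A", "C"), ("B", "D"), ("C", "B")])
plan = plan_backward(problem)
```

Because the planner only ever searches forward, any bias it has against backward reasoning does not come into play; the flipped problem simply presents the goal as the new start.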


GraphCroc: Cross-Correlation Autoencoder for Graph Structural Reconstruction

arXiv.org Artificial Intelligence

Graph-structured data is integral to many applications, prompting the development of various graph representation methods. Graph autoencoders (GAEs), in particular, reconstruct graph structures from node embeddings. Current GAE models primarily utilize self-correlation to represent graph structures and focus on node-level tasks, often overlooking multi-graph scenarios. Our theoretical analysis indicates that self-correlation generally falls short in accurately representing specific graph features such as islands, symmetrical structures, and directional edges, particularly in smaller or multiple graph contexts. To address these limitations, we introduce a cross-correlation mechanism that significantly enhances the GAE representational capabilities. Additionally, we propose GraphCroc, a new GAE that supports flexible encoder architectures tailored for various downstream tasks and ensures robust structural reconstruction through a mirrored encoding-decoding process. This model also tackles the challenge of representation bias during optimization by implementing a loss-balancing strategy. Both theoretical analysis and numerical evaluations demonstrate that our methodology significantly outperforms existing self-correlation-based GAEs in graph structure reconstruction.
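One way to see the limitation with directional edges is to compare the two decoders numerically. The sketch below assumes the common inner-product form, sigmoid(Z Zᵀ) for self-correlation and sigmoid(P Qᵀ) for cross-correlation with two separate embedding matrices; the exact GraphCroc decoder may differ, and the random embeddings `Z`, `P`, and `Q` are placeholders, not trained outputs.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_nodes, dim = 4, 3

# Self-correlation: a single embedding matrix decodes the adjacency matrix.
Z = rng.normal(size=(n_nodes, dim))
A_self = sigmoid(Z @ Z.T)

# Cross-correlation: two separate embedding matrices (assumed decoder form).
P = rng.normal(size=(n_nodes, dim))
Q = rng.normal(size=(n_nodes, dim))
A_cross = sigmoid(P @ Q.T)

# A_self is symmetric by construction, so the scores for i -> j and j -> i
# are always equal and a directed edge cannot be represented.
self_is_symmetric = np.allclose(A_self, A_self.T)
cross_is_symmetric = np.allclose(A_cross, A_cross.T)

# The diagonal of Z @ Z.T is a squared norm, so self-scores never drop below 0.5.
min_self_loop_score = A_self.diagonal().min()
```

In this sketch `A_cross` carries no symmetry or diagonal constraint, which is the extra freedom a cross-correlation decoder gains by using two embeddings.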